REESE: A Method of Soft Error Detection in Microprocessors

Authors

  • Joel B. Nickel
  • Arun K. Somani
Abstract

Future reliability of general-purpose processors (GPPs) is threatened by a combination of shrinking transistor size, higher clock rates, reduced supply voltages, and other factors. It is predicted that the occurrence of arbitrary transient faults, or soft errors, will dramatically increase as these trends continue. In this paper, we develop and evaluate a fault-tolerant microprocessor architecture that detects soft errors in its own data pipeline. This architecture accomplishes soft error detection through time redundancy, while requiring little execution time overhead. Our approach, called REESE (REdundant Execution using Spare Elements), first minimizes this overhead and then decreases it even further by strategically adding a small number of functional units to the pipeline. This differs from similar approaches in the past that have not addressed ways of reducing the overhead necessary to implement time redundancy in GPPs.
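The idea of detecting soft errors through time redundancy can be illustrated with a small software analogy: run the same operation twice and flag a fault when the two results disagree. The sketch below is only an illustration under stated assumptions, not the REESE hardware design. The names `flaky_add`, `execute_redundantly`, and `SoftErrorDetected`, as well as the single-bit-flip fault model, are hypothetical and do not come from the paper.

```python
import random
from typing import Callable


class SoftErrorDetected(Exception):
    """Raised when the two executions of an operation disagree."""


def flaky_add(a: int, b: int, fault_rate: float, rng: random.Random) -> int:
    """Hypothetical adder that occasionally flips one bit of its result."""
    result = a + b
    if rng.random() < fault_rate:
        # Assumed fault model: a transient upset flips one of the low 8 bits.
        result ^= 1 << rng.randrange(8)
    return result


def execute_redundantly(op: Callable[[], int]) -> int:
    """Run op twice (time redundancy) and compare the two results."""
    first = op()
    second = op()
    if first != second:
        raise SoftErrorDetected(f"mismatch: {first} != {second}")
    return first


if __name__ == "__main__":
    rng = random.Random(0)
    detected = 0
    for _ in range(1000):
        try:
            execute_redundantly(lambda: flaky_add(2, 3, 0.05, rng))
        except SoftErrorDetected:
            detected += 1
    print(f"detected {detected} mismatches in 1000 redundant executions")
```

Running every operation twice in this naive way roughly doubles the work, which is the kind of overhead the abstract says REESE aims to reduce. A fault that corrupts both executions identically would also go unnoticed by a plain comparison.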

Similar resources

Oware: Operand width Aware Redundant Execution for Whole-Processor Error Detection

As the feature size of semiconductor technology continues to shrink, high-performance microprocessors are increasingly susceptible to soft errors. Exploiting the fact that narrow-width values universally exist in applications, prior in-register duplication approaches for improving reliability of register file and other data-holding components mitigate performance cost but leave the rest of data...

Soft error tolerant Content Addressable Memories (CAMs) using error detection codes and duplication

Soft Errors are becoming a major concern for modern computing systems. Memories are one of the elements affected by soft errors, which cause bitflips in some of the cells. A number of techniques such as the use of Error Correction Codes (ECCs), interleaving or scrubbing are utilized to mitigate the effects of soft errors on memories. Content Addressable Memories (CAMs) pose additional challenge...

A Fault Detection Method for Combinational Circuits

As transistors become increasingly smaller and faster and noise margins become tighter, circuits and chips, especially microprocessors, tend to become more vulnerable to permanent and transient hardware faults. Most microprocessor designers focus on protecting memory elements among other parts of microprocessors against hardware faults through adding redundant error-correcting bits such as parity b...

An approach to fault detection and correction in design of systems using of Turbo codes

We present an approach to the design of fault-tolerant computing systems. In this paper, a technique is employed that enables the combination of several codes in order to obtain flexibility in the design of error-correcting codes. Code-combining techniques are very effective, and turbo codes are one such code. The Algorithm-based fault tolerance techniques that to detect errors rely on the c...

Checker Backend for Soft and Timing Error Detection and Recovery

Current microprocessors are becoming more vulnerable to cosmic particle strikes and parameter variations. Particle strikes may cause soft (transient) errors, whereas high variability (due to process, temperature and voltage) may transform non-critical paths into critical paths, resulting in timing errors. This paper proposes a design that exploits the benefits of clustering for detecting and re...

Publication date: 2001